NIF: An ontology-based and linked-data-aware NLP Interchange Format
Authors
Abstract
We are currently observing a plethora of Natural Language Processing tools and services being made available. Each of these tools and services has its particular strengths and weaknesses, but exploiting the strengths and synergistically combining different tools is currently an extremely cumbersome and time-consuming task. Also, once a particular set of tools is integrated, this integration is not reusable by others. We argue that simplifying the interoperability of different NLP tools performing similar but also complementary tasks will facilitate the comparability of results and the creation of sophisticated NLP applications. In addition, the synergistic combination of tools might ultimately yield a boost in precision and recall for common NLP tasks. In this paper, we present the NLP Interchange Format (NIF). NIF is based on a Linked Data-enabled URI scheme for identifying elements in (hyper-)texts and an ontology for describing common NLP terms and concepts. NIF-aware applications will produce output (and possibly also consume input) adhering to the NIF ontology. Unlike more centralized solutions such as UIMA and GATE, NIF enables the creation of heterogeneous, distributed and loosely coupled NLP applications, which use the Web as an integration platform. We evaluate the NIF approach by (1) benchmarking the stability of the NIF URI scheme and (2) providing results of a field study, in which we integrated six diverse NLP tools using NIF wrappers.
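The idea of identifying text elements by URI so that independent tools can annotate the same string might be sketched as follows. This is only an illustration: the `#char=begin,end` fragment syntax, the `ex:` property names, and the helper functions `fragment_uri`, `resolve`, and `merge` are assumptions made for this example, not the scheme or vocabulary defined in the paper.

```python
from collections import defaultdict


def fragment_uri(document_uri, text, begin, end):
    """Build a URI for text[begin:end] (hypothetical offset-based fragment)."""
    if not 0 <= begin < end <= len(text):
        raise ValueError("offsets must satisfy 0 <= begin < end <= len(text)")
    return f"{document_uri}#char={begin},{end}"


def resolve(uri, text):
    """Recover the substring that a fragment URI refers to."""
    _, fragment = uri.rsplit("#char=", 1)
    begin, end = (int(n) for n in fragment.split(","))
    return text[begin:end]


def merge(*tool_outputs):
    """Combine (subject, predicate, value) triples from several tools.

    Because every tool names the same string with the same URI,
    merging reduces to grouping triples by subject.
    """
    merged = defaultdict(dict)
    for triples in tool_outputs:
        for subject, predicate, value in triples:
            merged[subject][predicate] = value
    return dict(merged)


text = "Berlin is a city."
doc = "http://example.org/doc.txt"  # placeholder document URI
berlin = fragment_uri(doc, text, 0, 6)

# Two independent (imaginary) tools annotate the same string.
pos_tagger = [(berlin, "ex:posTag", "NNP")]
entity_linker = [(berlin, "ex:entityType", "Location")]

combined = merge(pos_tagger, entity_linker)
```

In this sketch neither tool needs to know about the other; agreement on how strings are named is enough for their outputs to be combined, which is the loosely coupled style of integration the abstract describes.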
Similar resources
Linked-Data Aware URI Schemes for Referencing Text Fragments
The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. The motivation behind NIF is to allow NLP tools to exchange annotations about text documents in RDF. Hence, the main prerequisite is that parts of the documents (i.e. strings) are referenceable by URIs, so that the...
Integrating NLP Using Linked Data
We are currently observing a plethora of Natural Language Processing tools and services being made available. Each of the tools and services has its particular strengths and weaknesses, but exploiting the strengths and synergistically combining different tools is currently an extremely cumbersome and time-consuming task. Also, once a particular set of tools is integrated, this integration is no...
Towards an Ontology for Representing Strings
The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. The motivation behind NIF is to allow NLP tools to exchange annotations about text documents in RDF. Hence, the main prerequisite is that parts of the documents (i.e. strings) are referenceable by URIs, so that the...
Corpus Conversion, Parsing and Processing Using the NLP Interchange Format 2.0
This work presents a thorough examination and expansion of the NIF ecosystem. Both core use cases of NIF, as a format for NLP tool integration, as well as a corpus pivot format, are addressed. NLP tool integration is extended with a new wrapper for the OpenNLP framework including a NIF parser, as well as a CoNLL converter to convert the widely used CoNLL format to NIF. Tools are examined for sc...
NIF Combinator: Combining NLP Tool Output
The NLP Interchange Format (NIF) is an RDF/OWL-based format that provides interoperability between Natural Language Processing (NLP) tools, language resources and annotations by allowing NLP tools to exchange annotations about text documents in RDF. Unlike more centralized solutions such as UIMA and GATE, NIF enables the creation of heterogeneous, distributed and loosely coupled NLP applica...